Abstract: [Objective] Real-time monitoring of cow ruminant behavior is of paramount importance for promptly obtaining 
relevant information about cow health and predicting cow diseases. Currently, various strategies have been proposed for 
monitoring cow ruminant behavior, including video surveillance, sound recognition, and sensor monitoring methods. How- 
ever, the application of edge device gives rise to the issue of inadequate real-time performance. To reduce the volume of 
data transmission and cloud computing workload while achieving real-time monitoring of dairy cow rumination behavior, 
a real-time monitoring method was proposed for cow ruminant behavior based on edge computing. [Methods] Autono- 
mously designed edge devices were utilized to collect and process six-axis acceleration signals from cows in real-time. 
Based on these six-axis data, two distinct strategies, federated edge intelligence and split edge intelligence, were investigat- 
ed for the real-time recognition of cow ruminant behavior. Focused on the real-time recognition method for cow ruminant 
behavior leveraging federated edge intelligence, the CA-MobileNet v3 network was proposed by enhancing the MobileNet 
v3 network with a collaborative attention mechanism. Additionally, a federated edge intelligence model was designed uti- 
lizing the CA-MobileNet v3 network and the FedAvg federated aggregation algorithm. In the study on split edge intelli- 
gence, a split edge intelligence model named MobileNet-LSTM was designed by integrating the MobileNet v3 network 
with a fusion collaborative attention mechanism and the Bi-LSTM network. [Results and Discussions] In comparative
experiments with MobileNet v3 and MobileNet-LSTM, the federated edge intelligence model based on CA-MobileNet v3
achieved an average Precision, Recall, F1-Score, Specificity, and Accuracy of 97.1%, 97.9%, 97.5%,
98.3%, and 98.2%, respectively, yielding the best recognition performance. [Conclusions] This study provides a real-time and
effective method for monitoring cow ruminant behavior, and the proposed federated edge intelligence model can be applied
in practical settings.
0 Introduction

The timing and intensity of rumination activities
in cows are crucial metrics for assessing their daily
behavioral patterns. Continuous and real-time monitoring
of rumination activities is beneficial for maximizing
animal welfare and farm productivity. Currently,
the primary methods for monitoring rumination activities
in cows are contactless, utilizing machine vision
and wireless sensor technology.
Wearable sensor- 
based monitoring systems have gained increasing pop- 
ularity due to their cost-effectiveness and ease of inte- 
gration with wireless networks. The most commonly 
used sensors include sound sensors, pressure sensors, 
and velocity sensors. Sound monitoring technology 
identifies cow rumination activity by analyzing the 
sounds produced during the rumination process. How- 
ever, sound recognition has a restricted detection range 
and is vulnerable to interference from noisy environ- 
ments, which can compromise system efficacy. The 
use of hydraulic tubes in pressure sensors to collect 
cow information can affect their comfort and is suscep- 
tible to damage, potentially risking leakage of fluids. 
Velocity sensors can overcome the limitations of sound
sensors and pressure sensors. Tani et al. utilized a
monitoring system equipped with a single-axis acceleration
sensor. This system extracts feature patterns
from feeding and rumination to distinguish jaw movements
and then matches similar feature patterns in
unanalyzed activities. However, the accuracy of recording
chewing signals is affected by the attachment
position of the sensor. Vazquez Diosdado et al. developed
a decision tree algorithm that utilizes three-axis
acceleration data collected from sensors mounted on
the cow's neck to distinguish lying, standing, and
eating behaviors. Benaissa et al. also fixed three-axis
acceleration sensors on the cow's neck to gather data
and devised a simple decision tree algorithm to identify
eating and rumination behaviors. Shen et al. conducted
further research on identifying cow rumination
behavior using data obtained from three-axis acceleration
sensors. Hou proposed a deep learning model
based on cow activity data to recognize cow rumination
behavior, building on machine learning techniques.
In all the studies mentioned above, the raw data
collected by sensors need to be transmitted to a
backend system for processing, making it challenging
to achieve real-time monitoring of cow rumination
behavior. Additionally, transmitting a large volume
of data results in higher energy consumption and
shorter battery life for the sensors.

The progression of edge computing is accelerating
the shift from cloud computing to the
edge. Edge computing moves cloud services closer
to the network edge, to data sources, and to
Internet of Things (IoT) devices. Edge intelligence,
which combines edge computing and artificial
intelligence (AI) technologies, enables AI algorithms
to be deployed at the network's edge so that analysis
and aggregation occur near the data capture points.
Edge intelligence primarily consists of federated edge
and split learning models. The federated edge intelligence
model is achieved by deploying federated learning in
wireless edge networks. This model consists of multiple
terminal devices and edge servers that collaboratively
train an AI model across multiple nodes. Models trained
locally on terminal devices are aggregated at the edge
server, enabling terminal devices and edge servers to
share the AI model. Split edge intelligence is based
on split learning techniques, dividing deep learning
models into sub-models and performing distributed
training at the edge. It splits the AI model into two
parts, with one part deployed on terminal devices near
the data input layer and the other part trained on edge
servers.

With the increasing popularity and development
of IoT devices in the context of smart farming, cattle
farms are now producing a wealth of data. The capability
to process data in real time at the terminal nodes is
crucial for these farms. However, terminal devices
have limited battery life and computational capabilities,
making it a challenge to independently manage
tasks that are both energy- and computation-intensive.
Edge computing is an emerging computing
model that allows computation to be executed at
the network edge. It supports compute-intensive
real-time monitoring applications in resource-constrained
cattle farm environments. This approach significantly
alleviates the load on network bandwidth and cloud
data centers, reducing latency and energy consumption in
computation. Devices upload data to edge servers
physically located nearby, offloading compute-intensive
and energy-intensive tasks to edge servers. This
effectively reduces energy consumption on terminal
devices and enables real-time processing of tasks.
Bu and Wang introduced a smart agricultural land
system based on deep reinforcement learning, incorporating
the concept of edge computing. This system
comprises agricultural data acquisition, edge computing,
agricultural data transmission, and cloud computing
layers. Shen et al. integrated a three-axis accelerometer
into edge computing equipment to collect cow
rumination information.


To complete computational tasks in real time at
the edge, near the cow data source, and thereby
reduce data transmission volume and network latency,
this research leveraged the advantages of wearable
devices and edge computing: wearable six-axis sensor
devices were employed to collect information on cow
rumination activities, and a deep learning algorithm
was combined with them to recognize rumination behavior.


1 Materials and methods 


1.1 Experimental data collection 


The experiment was conducted at the Acheng Ex- 
perimental base of Northeast Agricultural University 
from May 20 to June 20, 2022. The experimental sub- 
jects were 10 healthy Holstein cows in the non-lactat- 
ing period. Six-axis sensors were used as the terminal 
devices for six-axis acceleration data collection, with a 
sampling frequency of 5 Hz. The total duration of the 
dataset is over 180 hours. The terminal devices were 
installed on the collars worn by the cows as shown in 
Fig. 1. The cows were housed in two separate barns,
with each cow occupying a fenced enclosure made of 
iron bars. The experiment involved feeding the cows 
with a ratio of 3 : 7 of concentrate feed and ryegrass 
hay twice a day, and providing them with an adequate 
water supply. Each cowshed was equipped with an in- 
frared night vision camera installed 1.5 m in front and 
1.7 m above the ground, totaling 10 cameras to serve 
as a verification system. These infrared night vision 
cameras were synchronized with their respective termi- 
nal devices in terms of clock time, enabling the contin- 
uous recording of cow activities throughout the experi- 
ment, which were then stored in the cloud. Through 
monitoring video footage, it was observed manually 
that cows exhibited continuous chewing and swallow- 
ing actions while at rest, which was considered as the 
occurrence of rumination behavior. 

The perception module of the terminal device 
consists of the MPU6050, which is a 6-axis sensor that 
integrates a 3-axis MEMS gyroscope and a 3-axis 
MEMS accelerometer. It can be connected to third-par- 
ty digital sensors through the I2C interface and com- 
municate with all registers of the device. The wireless 
transmission module also utilizes Narrow Band-Inter- 
net of Things (NB-IoT) technology to reduce power 
consumption of the terminal device by transmitting data
through network aggregation. The edge server is located
at the edge layer and is responsible for handling
service requests through the rational deployment and
allocation of computing and storage capabilities at the
network edge. The Nvidia Jetson AGX Xavier was chosen
as the edge server.

Fig. 1 Cow rumination behavior monitoring experimental field and sensor wearing position

TensorFlow 2.3 was chosen as the deep learning
framework for the edge server, and Python 3.6 was employed
as the development language. The edge server
was first flashed with JetPack 4.2. Subsequently,
Miniforge was installed and environment variables
were configured. Finally, the deep learning framework
TensorFlow 2.3 was installed.


1.2 Data processing 


1.2.1 Pose analysis and calculation 


By obtaining the cow's posture information at the 
current moment, a better understanding of the cow's ru- 
mination behavior can be achieved. Pose analysis and 
estimation were performed using three-axis accelera- 
tion and three-axis angular velocity data. 

1) Selection of the posture coordinate system and 
definition of the posture angles. The calculation of the 
posture angles for cows required a coordinate system 
transformation, commonly using the body-fixed coor- 
dinate system (b-frame) and the navigation coordinate 
system (n-frame). In this study, a local Cartesian
coordinate system was chosen as the navigation
coordinate system. The data collection from the terminal
device was performed in the body-fixed coordinate
system, while the posture calculation was done in the
navigation coordinate system. Therefore, a coordinate
system transformation was required. The coordinate
transformation diagram is shown in Fig. 2. Since the
body-fixed coordinate system is fixed to the cow, it
changes with the cow's neck posture. Thus, the cow's
posture changes can be expressed by the coordinate
transformation matrix from the navigation coordinate
system to the body-fixed coordinate system, as shown
in Equation (1).

$$T^b = C_n^b T^n \tag{1}$$

Where $C_n^b$ is the coordinate transformation matrix
from the navigation coordinate system to the body-fixed
coordinate system; $T^n$ is the initial state of the
body in the navigation coordinate system represented
as the attitude vector; $T^b$ is the attitude vector of the
body after the posture change in the body-fixed coordinate
system.


Note: $X_n$ pointed to the east; $Y_n$ pointed to the north; $Z_n$ was vertical,
pointing upward with respect to the horizontal plane. The body-fixed
coordinate system was attached to the cow's neck, with $X_b$ pointing to the
right of the body; $Y_b$ pointing forward; $Z_b$ pointing upward, perpendicular
to the plane formed by $X_b$ and $Y_b$. There are three posture angles,
namely pitch angle ($\theta$), roll angle ($\varphi$), and yaw angle ($\psi$), which
represent the orientation of the cow relative to the ground in the cow
coordinate system.

Fig. 2 Coordinate schematic diagram of dairy cow posture transitions


In this study, the update of cow posture transfor- 
mation matrices was accomplished using quaternion 
methods. This choice was made because quaternion
methods could offer advantages over Euler angle methods
when dealing with the rotation of the cow's posture,
particularly for deviations induced in the pitch angle $\theta$.
Unlike Euler angles, quaternion methods do
not encounter issues with singularities and gimbal
lock, and they provide a comprehensive representation 
of posture information in all directions. Additionally, 
quaternion posture calculations involve lower compu- 
tational complexity and enable real-time updates of 
posture angles. 
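
As an illustration of the quaternion update described above, the following is a minimal sketch and not the authors' implementation. It assumes a first-order integration of the gyroscope rates followed by renormalization, and a ZYX (aerospace) rotation order when converting back to angles; the function names `quat_multiply`, `integrate_gyro`, and `quat_to_euler` are hypothetical.

```python
import math


def quat_multiply(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def integrate_gyro(q, omega, dt):
    """First-order update q <- q + 0.5 * dt * (q * (0, omega)), then renormalize."""
    dq = quat_multiply(q, (0.0, omega[0], omega[1], omega[2]))
    q = tuple(a + 0.5 * dt * b for a, b in zip(q, dq))
    norm = math.sqrt(sum(a * a for a in q))
    return tuple(a / norm for a in q)


def quat_to_euler(q):
    """Rotation angles (rad) about the x, y, z axes, assuming ZYX rotation order."""
    w, x, y, z = q
    rot_x = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    sin_y = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))  # clamp against rounding
    rot_y = math.asin(sin_y)
    rot_z = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return rot_x, rot_y, rot_z


# Usage: five samples at 5 Hz (dt = 0.2 s) of a 0.5 rad/s rotation about the x axis.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(5):
    q = integrate_gyro(q, (0.5, 0.0, 0.0), 0.2)
angles = quat_to_euler(q)
```

Which of the returned angles corresponds to pitch, roll, or yaw depends on how the sensor axes are mounted on the collar, so the mapping is left to the caller.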

2) Fusion attitude estimation based on Kalman
filtering. Using a single sensor, either an accelerometer
or a gyroscope, to estimate attitude angles can
lead to errors. Therefore, a Kalman filter algorithm
was employed for data fusion to achieve attitude
estimation and mutual complementation of multiple sensors.
This approach aimed to address noise interference
and obtain the optimal estimation of attitude angles.

The corresponding attitude angle information was
calculated using the three-axis acceleration measured
by the accelerometer, and the observation update equation
of the system was established. The angular velocity
update equation was then established using the
three-axis angular velocity and
served as the state update equation in the Kalman filtering
(KF) process. The state update equation is given in
Equation (2).

$$\begin{bmatrix}\theta_k \\ \Delta\omega_k\end{bmatrix} = \begin{bmatrix}1 & -dt \\ 0 & 1\end{bmatrix}\begin{bmatrix}\theta_{k-1} \\ \Delta\omega_{k-1}\end{bmatrix} + \begin{bmatrix}\omega_k\,dt \\ 0\end{bmatrix} + \begin{bmatrix}w_\theta \\ w_\omega\end{bmatrix} \tag{2}$$

Where $\theta_k$ is the optimal estimation of attitude angle
at time k; $\theta_{k-1}$ is the optimal estimation of attitude
angle at the previous time step; $\omega_k$ is the measurement
value of the gyroscope; $\Delta\omega_k$ is the a priori estimate of
gyroscope error at time k; $\Delta\omega_{k-1}$ is the optimal estimation
of gyroscope output error; $w_\theta$ and $w_\omega$ are the noises
of the attitude system. The observation update equation
of the system is established using the corresponding
attitude angle information calculated from the
three-axis acceleration measured by the accelerometer,
and it is given in Equation (3).


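$$\tilde{\theta}_k = \begin{bmatrix}1 & 0\end{bmatrix}\begin{bmatrix}\theta_k \\ \Delta\omega_k\end{bmatrix} + v_k \tag{3}$$

Where $\tilde{\theta}_k$ is the attitude angle information at time
k; $v_k$ is the measurement noise of the attitude system.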

The structure diagram of the attitude angle esti- 
mation based on the Kalman filter algorithm is shown 
in Fig. 3. By iterative computation, the data fusion of 
the attitude detection system is performed, and the optimal
estimation of the attitude angles is ultimately obtained.
Using the collected three-axis acceleration data, the
three-axis angular velocity data were calibrated and
compensated to calculate the optimal pitch angle $\theta$ and
roll angle $\varphi$.
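
The fusion step can be sketched for a single attitude angle as a two-state Kalman filter whose state holds the angle and the gyroscope error, with the accelerometer-derived angle as the observation. This is a minimal sketch under stated assumptions rather than the authors' code: the class name `AngleKalmanFilter` and the noise settings `q_angle`, `q_bias`, and `r_measure` are hypothetical, and their values are illustrative, not taken from the paper.

```python
class AngleKalmanFilter:
    """Two-state Kalman filter: state = [angle, gyro error], observation = angle."""

    def __init__(self, q_angle=0.001, q_bias=0.003, r_measure=0.03):
        self.angle = 0.0           # estimated attitude angle (rad)
        self.bias = 0.0            # estimated gyroscope error (rad/s)
        self.P = [[0.0, 0.0], [0.0, 0.0]]  # estimate covariance
        self.q_angle = q_angle     # assumed process noise of the angle
        self.q_bias = q_bias       # assumed process noise of the gyro error
        self.r_measure = r_measure # assumed accelerometer measurement noise

    def update(self, accel_angle, gyro_rate, dt):
        # Prediction: integrate the gyroscope rate after removing the error estimate.
        self.angle += dt * (gyro_rate - self.bias)
        P = self.P
        p00 = P[0][0] + dt * (dt * P[1][1] - P[0][1] - P[1][0]) + self.q_angle
        p01 = P[0][1] - dt * P[1][1]
        p10 = P[1][0] - dt * P[1][1]
        p11 = P[1][1] + self.q_bias

        # Correction: compare the prediction with the accelerometer-derived angle.
        s = p00 + self.r_measure
        k0 = p00 / s
        k1 = p10 / s
        innovation = accel_angle - self.angle
        self.angle += k0 * innovation
        self.bias += k1 * innovation
        self.P = [
            [p00 - k0 * p00, p01 - k0 * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
        return self.angle


# Usage: a stationary neck angle of 0.3 rad, a gyroscope with a constant
# 0.05 rad/s error, and an accelerometer angle disturbed by +/-0.05 rad.
kf = AngleKalmanFilter()
for k in range(1000):
    disturbance = 0.05 if k % 2 == 0 else -0.05
    estimate = kf.update(0.3 + disturbance, 0.05, dt=0.2)
```

In this synthetic run the estimate settles close to 0.3 rad while the error state absorbs the gyroscope offset, which is the complementary behavior the fusion is meant to provide.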

Fig. 4 shows the pitch angle $\theta$ and roll angle $\varphi$ of
a cow during rumination, comparing the experimental
results obtained with and without the use of Kalman
filtering for attitude estimation. Through comparative 
analysis, it can be observed that Kalman filtering re- 
duces the fluctuations in the estimated attitude angles. 
The fusion of multisensor data based on Kalman filter- 
ing yields better results for attitude estimation com- 
pared to using a single sensor alone. 

Fig. 3 Block diagram of attitude angle solution using the Kalman filter algorithm

Fig. 4 Comparison diagram of attitude estimation experiments with and without the Kalman filter


1.2.2 Feature extraction and selection 


To comprehensively explore the statistical characteristics
of the data, both time-domain and frequency-domain
features were extracted from the six-axis data.
The signal magnitude vector (SMV) and attitude angles
underwent Fourier transformation to obtain frequency
spectrum energy, frequency domain entropy, and the
DC component as frequency-domain features. Frequency
spectrum energy and frequency domain entropy
reflected the energy consumption of rumination
behaviors. The DC component represents the magnitude
of the first component obtained after performing
a fast Fourier transform. It is related to the reverse
gravity acceleration and the components in the x, y,
and z axes, thus reflecting the attitude of the cow's
neck.
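
The frequency-domain features named above can be sketched as follows. This is an illustrative sketch, not the authors' feature code: the exact normalization of spectral energy and entropy is not given in the text, so the definitions below (energy as the sum of squared non-DC magnitudes divided by the window length, entropy over the normalized non-DC power spectrum) are assumptions, and the function names are hypothetical. A plain DFT is used so the example needs only the standard library.

```python
import cmath
import math


def signal_magnitude_vector(ax, ay, az):
    """Per-sample magnitude of a three-axis signal."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform (O(n^2), fine for short windows)."""
    n = len(signal)
    mags = []
    for k in range(n):
        coeff = sum(
            signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(coeff))
    return mags


def frequency_features(signal):
    """Return (DC component, spectral energy, frequency-domain entropy)."""
    mags = dft_magnitudes(signal)
    dc_component = mags[0]                   # magnitude of the first component
    power = [m * m for m in mags[1:]]        # non-DC power spectrum
    total = sum(power)
    spectral_energy = total / len(signal)    # assumed normalization
    if total < 1e-12:                        # flat signal: no spectral spread
        entropy = 0.0
    else:
        probs = [p / total for p in power]
        entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return dc_component, spectral_energy, entropy


# Usage on one 16-frame window of a synthetic three-axis signal.
ax = [0.1 * math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
ay = [0.0] * 16
az = [1.0] * 16
features = frequency_features(signal_magnitude_vector(ax, ay, az))
```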

For the six-axis data, a data feature set consisting
of 96 dimensions was extracted, which included both
time-domain and frequency-domain features. The feature
information is presented in Table 1. The six-axis
data were segmented, and a continuous and non-overlapping
sequence of 288 frames was selected as the
minimum processing unit. Feature extraction was performed
within groups using a sliding window of
length 16 frames. The sliding window has a step
length of 3, resulting in a dataset of 96 frames. After
feature extraction, the dataset has an input dimension
of 96×96×1.
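
The per-window time-domain statistics listed in Table 1 can be sketched as below. This is a minimal sketch, not the authors' code: the quartile method (`inclusive`) and the use of the population standard deviation are assumptions, the function names are hypothetical, and how windows at the end of a 288-frame unit are handled is not specified in the text, so `sliding_windows` simply keeps complete windows.

```python
import math
import statistics


def sliding_windows(frames, length, step):
    """Split a sequence into complete windows of `length` frames, advancing by `step`."""
    return [frames[i:i + length] for i in range(0, len(frames) - length + 1, step)]


def time_domain_features(window):
    """Nine time-domain statistics of one signal channel within one window."""
    q1, median, q3 = statistics.quantiles(window, n=4, method="inclusive")
    mean = statistics.fmean(window)
    return {
        "min": min(window),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(window),
        "mean": mean,
        "rms": math.sqrt(statistics.fmean(x * x for x in window)),
        "std": statistics.pstdev(window),  # population form, an assumption
        "mad": statistics.fmean(abs(x - mean) for x in window),
    }


# Usage: statistics of each 16-frame window of one synthetic channel.
channel = [math.sin(0.3 * t) for t in range(64)]
rows = [time_domain_features(w) for w in sliding_windows(channel, length=16, step=3)]
```

Applying the nine statistics to the nine channels of Table 1 gives 81 of the 96 feature dimensions; the remainder come from the correlation and frequency-domain features.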


1.3 Federated edge intelligence model 


1.3.1 Improved MobileNet v3 with Co-attention 
mechanism 

The deployment of deep neural network models 
typically requires high-performance computing hard- 
ware support. However, edge devices have limited net- 
work resources compared to cloud servers, necessitat- 
ing the selection of lightweight neural networks as the 
base network for research on cow rumination behavior 
recognition methods. In 2017, the Google team proposed
MobileNet, a lightweight convolutional neural
network designed to be deployed on edge devices such
as mobile devices and embedded systems. MobileNet
constructs lightweight neural networks through
depthwise separable convolutions, significantly reducing
model parameters and computational requirements with
a minor decrease in accuracy compared to traditional
convolutional neural networks. MobileNet has evolved
through three versions, v1, v2, and v3, becoming an
essential tool for neural network applications on
mobile devices.
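
To make the parameter saving of depthwise separable convolutions concrete, the following sketch counts weights (biases ignored) for a standard convolution and for its depthwise-plus-pointwise replacement. The function names and the example layer sizes are illustrative and are not taken from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights of a standard kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights of a depthwise kernel x kernel convolution plus a 1x1 pointwise one."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Usage: a 3x3 layer mapping 16 channels to 24 channels.
standard = standard_conv_params(3, 16, 24)          # 3456 weights
separable = depthwise_separable_params(3, 16, 24)   # 528 weights
ratio = separable / standard                        # equals 1/c_out + 1/kernel**2
```

The ratio 1/c_out + 1/kernel² shows why the saving grows with the number of output channels: for 3×3 kernels it approaches one ninth of the original parameter count.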

Table 1 Time-domain and frequency-domain feature extraction information for six-axis data

Feature types               Feature quantity  Feature description
Minimum value               9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
The first quartile          9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
Median value                9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
Third quartile              9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
Maximum value               9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
Mean value                  9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
Root mean square            9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
Standard deviation          9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
Mean absolute deviation     9                 Triaxial acceleration, triaxial angular velocity, SMV, attitude angle
Coefficient of correlation  6                 Triaxial acceleration, triaxial angular velocity
Spectral energy             3                 SMV, attitude angle
Frequency domain entropy    3                 SMV, attitude angle
Direct current component    3                 SMV, attitude angle

MobileNet v3 incorporates Squeeze-and-Excitation
(SE) attention in some bneck blocks to improve
model accuracy by increasing the weights of salient
feature channels. However, SE attention ignores positional
information and only considers internal channel
information. In this study, CA attention was introduced
after the depthwise convolution that followed
the inverted residual module within the bneck
structure, thereby forming the CA-bneck. As shown in
Fig. 5, the CA-bneck structure embeds positional
information into channel attention, thereby reducing
additional computational costs and further enhancing
MobileNet v3's focus on key features.


Fig. 5 Structure diagram of CA-bneck 


The constructed dataset of size 96x96x1 was 
used as the input for the CA-MobileNet v3 network. 
The structure of the improved fusion cooperative atten- 
tion mechanism in the CA-MobileNet v3 network is 
shown in Table 2. The CA-MobileNet v3 network per- 
formed downsampling through convolutional stride op- 
erations without pooling operations. 


Table 2 Structure of CA-MobileNet v3

Input      Operation         Exp size  Output  CA  NL  Step length
96×96×1    conv2d, 3×3       —         16      —   HS  2
48×48×16   bneck, 3×3        16        16      √   RE  2
24×24×16   bneck, 3×3        72        24      —   RE  2
12×12×24   bneck, 3×3        88        24      —   RE  1
12×12×24   bneck, 3×3        96        40      √   HS  2
6×6×40     bneck, 5×5        240       40      √   HS  1
6×6×40     bneck, 5×5        240       40      √   HS  1
6×6×40     bneck, 5×5        120       48      √   HS  1
6×6×48     bneck, 5×5        144       48      √   HS  1
6×6×48     bneck, 5×5        288       96      √   HS  2
3×3×96     bneck, 5×5        576       96      √   HS  1
3×3×96     bneck, 5×5        576       96      √   HS  1
3×3×96     conv2d, 1×1       —         576     √   HS  1
3×3×576    pool, 3×3         —         —       —   —   1
1×1×576    conv2d, 1×1, NBN  —         1 024   —   HS  1
1×1×1 024  conv2d, 1×1, NBN  —         1       —   —   1

Note: Exp size represents the number of 1×1 expansion convolutional
kernels in the inverted residual structure; CA indicates whether the
improved CA-bneck is used; NL represents the type of activation function;
HS denotes the usage of h-swish; RE represents ReLU; NBN indicates
that no batch normalization operation is performed.


1.3.2 Federated edge intelligence model based 
on CA-MobileNet v3 

The construction of the federated edge intelligence
model involved two main entities: terminal devices
and edge servers. Terminal devices collaborated
to train and share a lightweight neural model, while
the edge server was responsible for collecting local
model parameters sent by the terminal devices and
performing aggregation. In this study, the terminal
devices and edge server shared the CA-MobileNet v3
with the fused cooperative attention mechanism and performed
collaborative training using distributed cow data samples
available on the terminal devices.

The federated edge intelligence system in this
study consisted of M terminal devices and a base station
equipped with an edge server. The base station
was connected to the terminal devices through a wireless
channel. The training process of the federated
edge intelligence model based on CA-MobileNet v3
follows the flowchart shown in Fig. 6. The learning
process of the federated edge intelligence model typically
involved four steps:



1) The edge server initialized the relevant parameters,
and the terminal devices downloaded the global
model as their initial local models;

2) The terminal devices trained their local models
using the collected real-time local data and uploaded the
model parameters;

3) The edge server collected the local model parameters
from each terminal device and performed aggregation
to update the global model;

4) The above three steps were repeated until the
global model converged, and the edge server deployed
the updated model parameters to each terminal device.



Fig. 6 Flow chart of training of federated edge intelligent model


Through interactions with cattle ranch users, terminal
devices obtain labeled training samples. These
samples are used as the local dataset, as shown in Equation (4).

$$\Omega_m = \{(x_m^1, y_m^1), (x_m^2, y_m^2), \ldots, (x_m^{n_m}, y_m^{n_m})\} \tag{4}$$

Where $x_m^i$ represents the features of the i-th training
sample of the m-th terminal device, $y_m^i$ represents
the corresponding sample label indicating whether it is
rumination or not, and $n_m$ denotes the number of training
samples owned by the m-th terminal device. The
training objective of federated learning is to minimize
the global loss function $L(w)$. When the CA-MobileNet v3
with the fused cooperative attention mechanism
is deployed on the terminal devices, the local
loss function of the m-th terminal device on its local
dataset $\Omega_m$ is defined as shown in Equation (5), and the
global loss function is defined as shown in Equation (6).

$$L_m(w) = \frac{1}{n_m}\sum_{i=1}^{n_m} \ell(w; x_m^i, y_m^i) \tag{5}$$

$$L(w) = \frac{1}{n}\sum_{m=1}^{M} n_m L_m(w) \tag{6}$$

Where $\ell(w; x_m^i, y_m^i)$ is the loss on a single training
sample and $n = \sum_{m=1}^{M} n_m$ is the total number of training samples.

In federated edge intelligence systems, edge servers
collect local model parameters from terminal devices
and refine the global model through aggregation.
To improve the performance of federated learning,
the strategy of averaging the parameters was adopted,
as implemented in the FedAvg algorithm proposed
by McMahan et al., as shown in Equation (7).

$$w_r = \sum_{m=1}^{M} \frac{|\Omega_m|}{|\Omega|} w_r^m \tag{7}$$

Where $|\Omega_m|$ represents the data size of the local
dataset on terminal device m; $|\Omega|$ represents the total
data size; $w_r^m$ represents the local model parameters of
terminal device m in the r-th communication round; $w_r$
represents the global model parameters at this time.
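
The aggregation step can be sketched as a data-size-weighted average of the local parameter vectors. This is a minimal illustration of the FedAvg averaging rule on flat lists of floats, not the authors' training code; the function name `fedavg` and the example values are hypothetical, and a real model would apply the same rule to every weight tensor.

```python
def fedavg(local_weights, local_sizes):
    """Weighted average of local parameter vectors by local dataset size.

    local_weights: one list of floats per terminal device (equal lengths).
    local_sizes:   number of local training samples per terminal device.
    """
    total = sum(local_sizes)
    n_params = len(local_weights[0])
    return [
        sum(size / total * weights[j]
            for weights, size in zip(local_weights, local_sizes))
        for j in range(n_params)
    ]


# Usage: two terminal devices holding 100 and 300 samples respectively.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [100, 300])
# The device with more data contributes three quarters of the average.
```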


1.4 Split edge intelligence model 


The convolutional recurrent neural network
(CRNN) combines the strengths of convolutional neural
networks (CNN) and bidirectional long short-term
memory (Bi-LSTM) networks to handle tasks that involve
sequential data. The MobileNet-LSTM model was
proposed based on the idea of CRNN. The model
consists of three parts: a convolutional neural network
module, a recurrent neural network module, and a fully
connected module. The CNN module was composed
of a lightweight CNN, CA-MobileNet v3,
which incorporates a fusion collaborative attention
mechanism. Firstly, CA-MobileNet v3 was used to extract
features from the six-axis data of cows, reducing
computational complexity. Secondly, the extracted features
were fed into a Bi-LSTM layer, followed by a
fully connected module to recognize the rumination
behavior of cows. The fully connected module consisted
of one fully connected layer and one Softmax classification
layer. To avoid overfitting, a batch normalization
(BN) layer was introduced after the Bi-LSTM output
and the fully connected layer. Finally, the Softmax
classification layer was used to recognize whether
cows were engaged in rumination behavior. The computational
process is illustrated in Equations (8)
and (9).


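$$\hat{R} = \frac{R - \mathrm{E}[R]}{\sqrt{\mathrm{Var}[R] + \varepsilon}} \tag{8}$$

$$O = \sigma(W_y \hat{R} + b_y) \tag{9}$$

Where $R = \{(r_1)^T, (r_2)^T, \ldots, (r_n)^T\}$ represents
the output results of the Bi-LSTM; $\hat{R}$ denotes the
results after batch normalization processing.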

To implement split edge intelligence, the Mo- 
bileNet-LSTM model was proposed, and the overall 
network structure is depicted in Fig. 7. The MobileNet- 
LSTM network comprised a CNN module, a RNN 
module, and a Fully Connected module. The CNN 
module is deployed on the edge server. CA-MobileNet 
v3 extracts features from the input dataset, ultimately 
obtaining a sequence of cow behavior features through 
feature dimension reduction. This sequence of behav- 
ior features was then fed into the RNN module, which 
consists of a Bi-LSTM network. The Bi-LSTM net- 
work captured temporal correlations in cow behavior 
data. The fully connected module included two fully 
connected layers and a Softmax classification layer. 
BN was introduced after each fully connected layer to 
prevent overfitting. Finally, the softmax classification 
layer was employed to recognize cow rumination be- 
havior. 



Fig. 7 Structure diagram of MobileNet-LSTM network


To build a distributed edge intelligence model, 
the MobileNet-LSTM was divided into two parts us- 
ing split learning techniques. A splitting layer was in- 
troduced between the convolutional neural network 
module and the recurrent neural network module. The 
shallow network and deep network were trained sepa- 
rately. The shallow network, which consists of the con- 
volutional neural network module, was deployed on 
the terminal devices for feature extraction. It extracts 


useful features from the raw input data. The deep net- 
work, including the recurrent neural network module 
and the fully connected module, was deployed on the 
edge server. The deep network performed fusion learn- 
ing among features to extract more complex higher-or- 
der features. As shown in Fig. 8, the training process 
of the split-based edge intelligence model based on 
MobileNet-LSTM typically involves five steps. 

1) Initialization: Each terminal device and the
edge server initialized their respective network models;

2) Data collection and forward propagation: Terminal
devices collected data and performed forward propagation
up to the splitting layer, obtaining the output features
of the splitting layer. These features were then
uploaded to the edge server;

3) Forward and backward propagation at the edge
server: The edge server received the feature data, performed
forward and backward propagation, and obtained
the gradients of the splitting layer. It sent these gradients
to all terminal devices;

4) Backward propagation at terminal devices: Terminal
devices used the gradients of the splitting layer
to perform backward propagation;

5) Iteration: Terminal devices and the edge server
iteratively executed the above steps until the model
converged.
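
The five steps above can be illustrated with a deliberately tiny split model in which the terminal device and the edge server each hold one scalar weight. This is a toy sketch of the message flow only, not the MobileNet-LSTM training code; the function name `split_training_step`, the squared-error loss, and the learning rate are assumptions made for the illustration.

```python
def split_training_step(w_client, w_server, x, y, lr=0.1):
    """One split-learning iteration for the toy model y_hat = w_server * (w_client * x)."""
    # Step 2: terminal device runs forward propagation up to the splitting layer
    # and uploads the splitting-layer output.
    h = w_client * x

    # Step 3: edge server finishes the forward pass, computes the loss, and
    # back-propagates down to the splitting layer.
    y_hat = w_server * h
    loss = 0.5 * (y_hat - y) ** 2
    grad_out = y_hat - y
    grad_w_server = grad_out * h
    grad_h = grad_out * w_server   # gradient of the splitting layer, sent back

    # Step 4: terminal device continues backward propagation from that gradient.
    grad_w_client = grad_h * x

    # Both sides update their own part of the model.
    return w_client - lr * grad_w_client, w_server - lr * grad_w_server, loss


# Step 1 (initialization) and step 5 (iteration until convergence).
w_client, w_server = 0.5, 0.5
losses = []
for _ in range(50):
    w_client, w_server, loss = split_training_step(w_client, w_server, x=1.0, y=1.0)
    losses.append(loss)
```

Only the splitting-layer output `h` and its gradient `grad_h` cross the network in this sketch; the raw input `x` stays on the terminal device.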



Fig. 8 Training flow chart of split edge intelligence model


1.5 Performance evaluation metrics 


Performance evaluation metrics calculated from the recognition results measure the discrepancy between the predicted class and the true class, and were used to evaluate the classifier's performance. In the real-time cow rumination behavior recognition task, each test input was assigned to one and only one predefined class, allowing for clear definitions of true positive (TP), false positive (FP), true negative (TN), and false negative (FN). Performance was therefore evaluated using Precision, Recall, F1-Score, Specificity, and Accuracy. The formulas are shown in Equations (10)~(14).
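$$\text{Precision} = \frac{TP}{TP + FP} \tag{10}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{11}$$

$$F_1\text{-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{12}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{13}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{14}$$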
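
As an illustration, the five metrics can be computed directly from the four confusion-matrix counts. The sketch below is a plain-Python rendering of the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the experiments.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute Precision, Recall, F1-Score, Specificity and Accuracy."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    specificity = _ratio(tn, tn + fp)
    accuracy = _ratio(tp + tn, tp + tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts for a rumination / non-rumination test set.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
```

With these hypothetical counts, Precision is 90/100 and Accuracy is 175/200. Returning 0.0 for an empty denominator is a convention chosen here, not something stated in the paper.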


2 Results and analysis 


2.1 Test results of federated edge intelligence model


In the experiment, a preprocessed dataset conforming to the input requirements of the neural network model was selected for training. The dataset was divided into training, validation, and test sets in a ratio of 6:2:2. It includes approximately 5 million preprocessed six-axis acceleration data points, with around 3 million data points in the training set and approximately 1 million data points each in the validation and test sets. The CA-MobileNet v3 network was used to construct the real-time cow rumination behavior recognition model. During model training, the cow rumination recognition model was trained on the edge server using the training set, and the validation set was used for parameter adjustment and further training until the model converged. During model testing, the cow rumination recognition model obtained in the training phase based on CA-MobileNet v3 was deployed on both the terminal devices and the edge server. The test set was then used to evaluate the rumination recognition results.
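
A 6:2:2 division of this kind can be sketched as follows. The source does not state how samples were assigned to the three sets, so the seeded random shuffle and the function name `split_622` are assumptions made purely for illustration.

```python
import random


def split_622(samples, seed=0):
    """Split samples into training, validation and test sets in a 6:2:2 ratio."""
    items = list(samples)
    # Assumption: samples are shuffled before splitting; the paper does not say.
    random.Random(seed).shuffle(items)
    n_train = len(items) * 6 // 10
    n_val = len(items) * 2 // 10
    train_set = items[:n_train]
    val_set = items[n_train:n_train + n_val]
    test_set = items[n_train + n_val:]
    return train_set, val_set, test_set
```

Integer arithmetic is used for the set sizes so that any remainder falls into the test set rather than being lost to rounding. For windowed time-series data, a split by cow or by recording session may be preferable to a random shuffle, since neighboring windows are correlated.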

To validate the role of the collaborative attention module in improving rumination recognition accuracy, the MobileNet v3 and CA-MobileNet v3 models were compared. The performance evaluation metrics of both models are presented in Table 3. The table shows that CA-MobileNet v3 achieves an increase of 5.3 percentage points in precision and 2.7 percentage points in accuracy compared to MobileNet v3. This suggests that integrating collaborative attention enhances the model's performance in cow rumination recognition tasks.


Table 3 Performance comparison of MobileNet v3 and CA-MobileNet v3 models of federated edge intelligence model

Performance metrics    MobileNet v3    CA-MobileNet v3
Precision/%            90.5            95.8
Recall/%               91.6            93.6
F1-Score/%             91.1            94.7
Specificity/%          94.5            97.7
Accuracy/%             93.5            96.2


The federated edge intelligence model allowed the data on different terminal devices to complement each other, increasing the effective data volume. It achieved this without the need to centralize cow data samples, enabling efficient feature extraction and utilization. The federated edge intelligence model was trained with the following settings: training batch size, 16; number of epochs, 125; initial learning rate, 0.001; number of terminal devices participating in the federated learning system, 10.

In the federated learning process, each terminal device uploaded the weight parameters of its local model to the edge server after receiving the new global model. The central federated server was hosted on the edge server, which was equipped with an NVIDIA GTX 1080Ti GPU with 11 GB of VRAM, an Intel Core i7-7800X CPU, and 64 GB of RAM.

Comparative experiments were conducted by setting different values for I (the number of local training iterations on terminal devices) and analyzing the accuracy and loss curves shown in Fig. 9. The loss curves show that as I increases, the model converges faster. The accuracy curves reveal that as I increases from 1 to 6, the accuracy of the converged model gradually improves. However, when I increases from 6 to 10, the accuracy of the converged model shows a declining trend. This is because a larger I results in more local iterations of gradient descent training on terminal devices, leading to faster convergence and improved accuracy. However, an excessively large I leads to too many local training iterations between two model aggregations, causing overfitting to local data and weakening the effectiveness of the FedAvg federated aggregation algorithm.
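
The role of I can be made concrete with a minimal FedAvg sketch. This is not the CA-MobileNet v3 training code: as a simplifying assumption, each terminal device fits a two-parameter linear model y = w x + b by gradient descent, and the names `local_train`, `fedavg`, and `federated_training` are hypothetical. The aggregation step is the standard FedAvg rule, a sample-count-weighted average of the local weights.

```python
def local_train(weights, data, local_iters, lr):
    """Run `local_iters` (the paper's I) gradient-descent steps on one device."""
    w, b = weights
    for _ in range(local_iters):
        grad_w = sum((w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum((w * x + b - y) for x, y in data) / len(data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return [w, b]


def fedavg(local_weights, sample_counts):
    """Aggregate local models, weighting each by its number of samples."""
    total = sum(sample_counts)
    n_params = len(local_weights[0])
    return [
        sum(n * w[k] for w, n in zip(local_weights, sample_counts)) / total
        for k in range(n_params)
    ]


def federated_training(device_data, rounds, local_iters, lr=0.1):
    """Alternate local training on every device with FedAvg aggregation."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        # Each device starts from the current global model and trains locally.
        local_weights = [
            local_train(global_weights, data, local_iters, lr)
            for data in device_data
        ]
        # The server aggregates the uploaded weights into a new global model.
        global_weights = fedavg(local_weights, [len(d) for d in device_data])
    return global_weights
```

In this sketch, raising `local_iters` moves each local model further toward its own device's data before the next aggregation. That matches the trade-off described above: fewer aggregation rounds are needed, but local models can drift apart when the devices' data differ.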


Fig. 9 Accuracy curve and loss curve of training models with different I values of CA-MobileNet v3 (a. Accuracy curves; b. Loss curves)


In summary, setting I = 6 achieves the best recognition performance, and the corresponding performance evaluation metrics are shown in Table 4. Table 4 shows that the federated edge intelligence model based on CA-MobileNet v3 improves on every performance metric after federated learning. Specifically, recall increases by 4.6 percentage points and accuracy by 2.4 percentage points. These experimental results indicate that the CA-MobileNet v3 model based on federated learning can effectively enhance the recognition of cow rumination behavior.


Table 4 Performance comparison results of federated learning and non-federated learning models

Performance metrics    Non-federated learning model    Federated learning model
Precision/%            95.0                            97.1
Recall/%               93.3                            97.9
F1-Score/%             94.1                            97.5
Specificity/%          97.2                            98.3
Accuracy/%             95.8                            98.2


2.2 Test results of split edge intelligence model


The split edge intelligence model was trained in a supervised manner. To validate the effectiveness of the proposed method, ablation experiments were conducted between CA-MobileNet v3 and MobileNet-LSTM. Both models were configured with the same training parameters: a batch size of 16, 125 epochs, and an initial learning rate of 0.001. The resulting loss and accuracy curves for both models are shown in Fig. 10.


Fig. 10 Comparison of training curves between CA-MobileNet v3 and MobileNet-LSTM of split edge intelligence model (a. Accuracy curves; b. Loss curves; horizontal axis: Epoch)


As seen in Fig. 10a, MobileNet-LSTM, the split edge intelligence model, achieves a final recognition accuracy of 96.2%, an improvement over the 95.8% accuracy achieved by CA-MobileNet v3. The higher accuracy of MobileNet-LSTM is attributed to its fusion of contextual information related to cow behavior, which enhances the recognition accuracy. Fig. 10b shows that MobileNet-LSTM converges faster than CA-MobileNet v3. This is because MobileNet-LSTM incorporates a BN layer, which speeds up the training of the network.


2.3 Comparative experimental analysis


The federated edge intelligence model based on 
CA-MobileNet v3 and the split edge intelligence mod- 
el based on MobileNet-LSTM were selected for com- 
parative experiments. The performance indicators of 
the experimental results are shown in Table 5. 

Table 5 Comparison of the performance of federated and split edge intelligence models

Performance metrics/%    Federated edge intelligence model    Split edge intelligence model
Precision                97.1                                 95.8
Recall                   97.9                                 93.7
F1-Score                 97.5                                 94.8
Specificity              98.3                                 97.7
Accuracy                 98.2                                 96.2

The federated edge intelligence model based on CA-MobileNet v3 outperforms the split edge intelligence model based on MobileNet-LSTM on all five performance metrics. This is because the federated learning-based model trains locally, uploads the model parameters, and aggregates them to update the global model until convergence, which allows it to extract deeper cow rumination behavior features. The split edge intelligence model, on the other hand, requires substantial intermediate data transmission between the terminal devices and the edge layer, which may result in data loss and a decrease in recognition accuracy.

Fig. 11 presents the experimental results as box plots for MobileNet v3, CA-MobileNet v3, the federated edge intelligence model, and the split edge intelligence model. The comparison shows that the federated learning-based CA-MobileNet v3 network, i.e., the federated edge intelligence model, not only improves the recognition accuracy of rumination behavior but also leads to a more concentrated data distribution. Compared to the split edge intelligence model based on MobileNet-LSTM, the federated edge intelligence model based on federated learning and CA-MobileNet v3 exhibits a more stable and concentrated prediction distribution, with fewer outliers. This indicates that the model has greater reliability and effectiveness in recognizing cow rumination behavior. Although the method proposed by Shen et al. achieved a recall of 94.3%, which was better than that of this research, it did so at the cost of transmitting large amounts of data and increasing equipment energy consumption. In contrast, although the performance indices of this study have declined slightly, this method reduces the amount of data transmission and cloud computing to achieve real-time cow rumination recognition in an environment with low network performance.


Fig. 11 Boxplot of experimental results distribution for MobileNet v3, CA-MobileNet v3, federated edge intelligence model, and split edge intelligence model


3 Conclusions


In this study, based on the concept of edge computing, edge devices were used to capture and process the six-axis acceleration signals of cows in real time. By integrating a collaborative attention mechanism into the MobileNet v3 network, the CA-MobileNet v3 network was proposed. The federated edge intelligence model was subsequently constructed using the CA-MobileNet v3 network in conjunction with the FedAvg model aggregation algorithm.

Experimental findings reveal that the proposed CA-MobileNet v3 network improves precision by 5.3 percentage points compared to the MobileNet v3 network, while the FedAvg federated aggregation algorithm raises the recall rate by 4.3 percentage points within the federated edge intelligence model, underscoring the efficacy of the proposed federated edge intelligence model. Furthermore, using the CA-MobileNet v3 network and the Bi-LSTM network, a split edge intelligence model based on MobileNet-LSTM was designed, and comparative experiments were conducted between the federated edge and split edge intelligence models. The experimental results show that the federated edge intelligence model achieves the best recognition performance, with average Precision, Recall, F1-Score, Specificity, and Accuracy for cow rumination behavior reaching 97.1%, 97.9%, 97.5%, 98.3%, and 98.2%, respectively.

The proposed edge intelligence model not only effectively expands the dimensionality of cow data samples but also improves the accuracy of cow rumination behavior recognition.